Purpose

This document elucidates some of the remote sensing satellite options for creating a rasterized time series of presence/absence of sediment plumes and algal blooms within the western portion of Lake Superior. This summary is limited to the Landsat constellation, Moderate Resolution Imaging Spectroradiometer (MODIS)/Visible Infrared Imaging Radiometer Suite (VIIRS), and Harmonized Landsat-Sentinel (HLS) as they provide 3 spatial and temporal options for raster development. While these are by no means all of the satellite or satellite products available to use for this purpose, these options will inform our decisions for image processing pipeline development. This document summarizes the benefits and trade offs of each of the satellite-derived data products as well as an estimate of what the deliverable product might look like. We will need your feedback to move forward with development.

Background

Below is the area of interest (AOI, green-shaded area) and observed blooms (yellow pins), dating back to 1995 (data provided by Kate Reinl and should not be made public at this time). The observations have been limited to those within our AOI.

Figure 1. Area of interest and observations of blooms from the Algal Bloom and Nutrient Subgroup of the Lake Superior Partnership. Do not distribute this image publicly at this time.

Time and Scale

There is an inherent tradeoff between satellite image overpass frequency and the resolution (pixel size) of the satellite sensors (Table 1). Satellites with more frequent overpasses will have a larger pixel size. Note the differences in data access between sources. HLS data are not yet available on Google Earth Engine (GEE) and require a different acquisition and processing pipeline. Additionally, while VIIRS is listed here, we do not provide a summary of VIIRS data for availability. VIIRS is the ‘next generation’ of the MODIS data and unless we are going to develop the pipeline on MODIS data, we will not use the VIIRS dataset. We would expect VIIRS to have a similar data inventory to MODIS as described later in the text.

Table 1. Satellite background information. a LS7 is currently in the process of being decommissioned. While we are still seeing LS7 images through 2022, they have started to become unreliable due to the proximity of the satellite to the earth. b Data are resampled to a 30x30m pixel, even if the resolution is larger. c There are two sensor groups on the VIIRS satellite with differing resolutions. d Data become more sparse in January 2021 - NASA is not processing all available tiles but can upon request. It is unclear what the timeline for such a request would be.
Satellite/Product Years of Data Availability Overpass Frequency Resolution Data Access
Landsat 4, 5, 7, 8, 9

LS4: 1983-2001

LS5: 1984-2013

LS7: 1999-2021a

LS8: 2013-current

LS9: 2021-current

16 days per satellite mission 30m2b Google Earth Engine
MODIS (Aqua/Terra)

Aqua: 2000-current

Terra: 2002-current

~ 2 days per satellite

a processed ‘daily’ product available beginning 2002

500m2 Google Earth Engine
VIIRS 2011-current daily 375 & 750m2c Google Earth Engine
Harmonized Landsat-Sentinel Product 2013-currentd 2-3 days 30m2 LP DAAC

AOI-Satellite Tile Overlays

Satellite images are processed as tiles. Each of the satellite constellations use different grid systems. To get an idea of the satellite tile area relative to the AOI, below are images of each satellite’s tile outline (red) and our AOI (green-shaded polygon).

Figure 2. Landsat tiles over western Lake Superior. Green-shaded area is AOI. Pictured are World Reference System tiles path 26, rows 27 and 28 in red.

Figure 3. MODIS/VIIRS tile over western Lake Superior. Green-shaded area is AOI. AOI falls wholly within the h11v4 MODIS/VIIRS tile.

Figure 4. HLS/Sentinel tiles over western Lake Superior. Green-shaded area is AOI. Tiles pictured are 15TWN, 15TXN, 15TYM, 15TWM, 15TXM, 15TYM, left to right, top to bottom.

Image Inventories

Despite the return frequencies of each of the satellite constellations, the availability of images is defined by cloud cover. Given this, we have summarized the availability of images per satellite constellation/product based on the total AOI. We use a different method to aggregate HLS data versus Landsat and MODIS data due to how they are accessed. Additional context:

For each of the summary periods below (weekly, monthly), the first figure provides an estimate of the minimum proportion of the AOI that we would be able to rasterize into open water-plume-bloom. The second figure gives an idea of the likelihood that we can ‘fill in the gaps’ from the first figure. The higher the number of images available, the more likely we will be to fill out the AOI with rasterized open water-plume-bloom information.

Weekly summaries

Given the data available, when summarized to a weekly time series, MODIS is the only satellite with robust enough data to build a consistent output product at this timestep (Figure 5, 6). Ideally we want to have a high ‘maximum proportional area’ (Figure 5) and a high ‘number of images above threshold’ (Figure 6). These data have been limited to the dates of April through October (inclusive).

Figure 5. Indication of maximum proportional area without clouds in a single satellite-derived image of our AOI per week by satellite product. Because of the high return frequency of MODIS, we have more images available to retrieve complete coverage of our AOI, which results in a greater maximum proportional area value (more yellow tiles). Note that the data gap frequency and lenght in HLS coverage begins in 2021.

Here we summarize the number of images available per week that surpass our proportional quantity threshold of at least 40% of our AOI available for Landsat and MODIS or 30% for our estimate for HLS for analysis.

Figure 6. The number of images above described proportional quantity threshold per week per data source. Many of the HLS and Landsat aggregations have only one or two images above the threshold, which would limit our ability to create an open water-plume-bloom raster on the time step of one week, while MODIS often has between 6-10 images to work with in a given week.

Monthly Summaries

When the data are aggregated on a monthly basis, our options for using other satellite-derived images to produce a more consistent raster increases beyond the MODIS data.

Figure 7. Indication of maximum proportional area without clouds in a single satellite-derived image of our AOI per month by satellite product. Aggregating to a monthly time step would allow us to use any of the data products, though HLS- and Landsat-derived rasters are likely to have some spatial gaps given the maximum proportional area.

Figure 8. The number of images above the proportional quantity threshold per week by satellite product. Note the increase in Landsat scenes beginning in 2001 and the frequency of instances where HLS and Landsat have no images above the proportional quantity threshold.

Notes on data aggregation:

We will have to consider a variety of aggregation techniques and enumerate uncertainties in our final output product. Here are some general notes/thoughts to keep in mind as we move forward:

  • We assume the data ‘subset’ (i.e. the available images for any time period) is representative of actual conditions throughout the entire time period of aggregation. Because return frequency is static and cloud cover is stochastic, we can probably assert this without anyone getting too upset.

  • Any area of our AOI that has fewer stacked images compared to other areas during any aggregated time period may have a greater uncertainty associated with their labels. This may be especially true for those areas where we have labeled “open water” values. There is likely some threshold where there is no impact to uncertainty relative to the number of images included in any raster sub-area.

Potential Raster Output

Landsat:

With data dating back to the 1980s, this is the optimal data source for time series over many decades; however, in order to create a reliable and complete raster data set the data need to be aggregated to a monthly time step. Although the pixel resolution is sufficient for identification of floating-mat blooms, we may miss shoulder-season events in April/May and September/October due to limited images above our proportional quantity threshold (Figure 8). The output of a pipeline built on Landsat would likely be a monthly raster of 50-100m pixels from 1985-2022.

MODIS:

MODIS data availability and extent is the most optimal of the three satellite constellations shown here. The major drawback is resolvable pixel size (500-1000 m2). For our particular research question, we may miss blooms completely due to this. The output of a pipeline built on MODIS imagery would likely be a weekly raster of 500m pixels from 2000-2022.

HLS:

A big part of the novelty of the HLS data acquisition and analysis pipeline is using STAC (the Spatialtemporal Asset Catalog) to request, process, and gather images and we have not yet built out this pipeline beyond taking inventory of the available data. This product shares many of the drawbacks listed in the Landsat section above, though the more frequent gaps in data (represented by lower counts of images in Figure 8) may mean we ‘miss’ bloom events, particularly in 2021-2022 when the data become more limited (which is clear in Figure 5). The output of a pipeline based on HLS data would likely be a monthly raster of a 50-100m resolution at a slightly higher or uncertainty than the Landsat output. The HLS-derived raster would be limited to the time period 2013-2022.

Next Steps

Given the constraints, caveats, and tradeoffs described above, we now need direction on what the raster output should look like to help guide in the creation of a data processing pipeline. This includes feedback on the temporal time step, the optimal raster pixel size, and the initial satellite product to develop the process.

To be thorough, we will likely try to extract images for all of the described satellites above where there are overlapping- and/or neighboring-by-date observations of algal blooms across our AOI (as shown in Figure 1). This will help us investigate how we can incorporate the observed dates of blooms from Kate in our pipeline. If there is insufficient data to inform an algorithm of this nature, we will pivot to using established algorithms for estimating chlorophyll-a from image bandmath. We will be working on this the week prior to your arrival at CSU to help inform our discussions together.

While there are clustering techniques built within GEE, the techniques are limited as well as any ability to enumerate uncertainty. In order to process either STAC- or GEE-acquired image collections with clustering techniques and image segmentation, all pipelines will require hardware for local storage and computing power.